Discourse Learning: Dialogue Act Tagging with Transformation-Based Learning
Abstract
My central goal is to compute dialogue acts automatically. A dialogue act is a concise abstraction of a speaker's intention, such as SUGGEST or REQUEST. Recognizing dialogue acts is critical to understanding at the discourse level, and dialogue acts can also be useful for other applications, such as resolving ambiguity in speech recognition. Often, however, a dialogue act cannot be directly inferred from a literal reading of an utterance. Machine learning offers promise as a means of discovering patterns in corpora of data, since the computer can efficiently analyze large quantities of information. My research is the first to investigate using Brill's (1995) Transformation-Based Learning (TBL) algorithm to compute dialogue acts (Samuel, Carberry, and Vijay-Shanker 1998). I selected this machine learning method over the alternatives for several reasons, including the following: TBL has been applied successfully to a similar problem, Part-of-Speech Tagging (Brill 1995); TBL produces an intuitive model; TBL can easily accommodate local context as well as distant context; and TBL demonstrates resistance to overfitting. To address some limitations of the original TBL algorithm and to deal with the particular demands of discourse processing, I developed some extensions to my system, including a Monte Carlo approach that randomly samples from the space of available rules, rather than exhaustively generating all possible rules. This significantly improves efficiency without compromising accuracy (Samuel 1998). Also, to circumvent a sparse data problem, it is necessary to transform the input data by extracting values for a set of simple features of utterances, such as cue phrases and nearby dialogue acts. I am utilizing a very general set of cue phrases that includes patterns such as "but", "thanks", "what time", and "busy".
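The Monte Carlo idea described above can be illustrated with a minimal sketch. It is not the system reported in the abstract: the rule form ("if feature equals value, retag as X"), the feature name `cue`, the toy utterances, and the parameters `sample_size` and `max_rules` are all assumptions made for illustration. Features are also kept static here, whereas a fuller TBL system would let rules inspect nearby dialogue acts that change as rules are applied.

```python
import random
from collections import Counter

# Assumed rule form: (feature, value, new_tag), read as
# "if an utterance has feature == value, retag it as new_tag".

def candidate_rules(utterances, gold, current):
    """Instantiate rules only from utterances that are currently mis-tagged."""
    rules = set()
    for feats, g, c in zip(utterances, gold, current):
        if g != c:
            for feat, value in feats.items():
                rules.add((feat, value, g))
    return sorted(rules)

def score(rule, utterances, gold, current):
    """Net number of tags the rule would fix (fixes minus newly broken tags)."""
    feat, value, new_tag = rule
    gain = 0
    for feats, g, c in zip(utterances, gold, current):
        if feats.get(feat) == value and c != new_tag:
            gain += (new_tag == g) - (c == g)
    return gain

def apply_rule(rule, utterances, current):
    feat, value, new_tag = rule
    return [new_tag if feats.get(feat) == value else c
            for feats, c in zip(utterances, current)]

def train(utterances, gold, sample_size=2, max_rules=10, seed=0):
    """Greedy TBL loop that scores only a random sample of candidate rules."""
    rng = random.Random(seed)
    default = Counter(gold).most_common(1)[0][0]  # initial tag for everything
    current = [default] * len(gold)
    learned = []
    for _ in range(max_rules):
        pool = candidate_rules(utterances, gold, current)
        if not pool:
            break
        # Monte Carlo step: sample from the rule space instead of scoring all of it.
        sampled = rng.sample(pool, min(sample_size, len(pool)))
        best = max(sampled, key=lambda r: score(r, utterances, gold, current))
        if score(best, utterances, gold, current) <= 0:
            break
        current = apply_rule(best, utterances, current)
        learned.append(best)
    return default, learned

def tag(utterances, default, rules):
    """Apply the learned rules, in order, to freshly default-tagged utterances."""
    current = [default] * len(utterances)
    for rule in rules:
        current = apply_rule(rule, utterances, current)
    return current

# Toy data (hypothetical): one cue-phrase feature per utterance.
utterances = [{"cue": c} for c in
              ["thanks", "what time", "but", "thanks",
               "what time", "how about", "how about", "how about"]]
gold = ["THANK", "REQUEST", "REJECT", "THANK",
        "REQUEST", "SUGGEST", "SUGGEST", "SUGGEST"]
default, rules = train(utterances, gold)
```

Calling `tag(utterances, default, rules)` replays the learned transformations in order. In this sketch a small `sample_size` trades a less thorough search per iteration for less scoring work.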
To automatically collect those cue phrases that appear frequently in dialogue and provide useful clues to help determine the appropriate dialogue acts, I devised an entropy approach (selecting words so that the dialogue acts co-occurring with those words have low entropy) with a filtering mechanism (removing cue phrases that merely provide redundant information). Other researchers have been investigating machine
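As a rough illustration of the entropy approach with filtering, the sketch below selects word sequences whose co-occurring dialogue acts have low entropy and then drops redundant ones. The thresholds (`min_count`, `max_entropy`), the maximum phrase length, the toy utterances, and in particular the redundancy test (a phrase is dropped when it contains another selected phrase whose entropy is at least as low) are assumptions for illustration; the abstract does not specify these details.

```python
import math
from collections import Counter, defaultdict

def act_entropy(counts):
    """Shannon entropy (in bits) of the dialogue acts co-occurring with a phrase."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def select_cue_phrases(tagged, max_len=2, min_count=2, max_entropy=0.5):
    """Pick frequent, low-entropy word sequences, then drop redundant ones."""
    acts_by_phrase = defaultdict(Counter)
    for text, act in tagged:
        words = text.lower().split()
        # Count each phrase at most once per utterance.
        phrases = {" ".join(words[i:i + n])
                   for n in range(1, max_len + 1)
                   for i in range(len(words) - n + 1)}
        for phrase in phrases:
            acts_by_phrase[phrase][act] += 1

    entropy = {p: act_entropy(c) for p, c in acts_by_phrase.items()
               if sum(c.values()) >= min_count}
    selected = {p: h for p, h in entropy.items() if h <= max_entropy}

    def redundant(p):
        # Assumed filter: p adds nothing if it contains another selected
        # phrase that is at least as predictive of the dialogue act.
        return any(q != p and f" {q} " in f" {p} " and h <= selected[p]
                   for q, h in selected.items())

    return sorted(p for p in selected if not redundant(p))

# Toy corpus (hypothetical) of (utterance, dialogue act) pairs.
data = [
    ("thanks a lot", "THANK"),
    ("thanks very much", "THANK"),
    ("what time is good", "REQUEST"),
    ("what time works", "REQUEST"),
    ("but i am busy", "REJECT"),
    ("but i have no time", "REJECT"),
    ("what about monday", "SUGGEST"),
    ("what about friday", "SUGGEST"),
]
cues = select_cue_phrases(data)
```

In this toy corpus, "what" and "time" each co-occur with more than one dialogue act and are rejected on entropy, while the two-word phrase "what time" is kept because neither of its parts was selected.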
Similar resources
Development of a Machine Learnable Discourse Tagging Tool
We have developed a discourse-level tagging tool for spoken dialogue corpora using machine learning methods. As discourse-level information, we focused on dialogue acts, relevance, and discourse segments. For dialogue act tagging, we implemented a transformation-based learning procedure, which achieved 70% accuracy in an open test. In relevance and discourse segment tagging, we have implemented a ...
An Investigation of Transformation-Based Learning in Discourse
This paper presents results from the first attempt to apply Transformation-Based Learning to a discourse-level Natural Language Processing task. To address two limitations of the standard algorithm, we developed a Monte Carlo version of Transformation-Based Learning to make the method tractable for a wider range of problems without degradation in accuracy, and we devised a committee method for a...
Posting Act Tagging Using Transformation-Based Learning
In this article we present the application of transformation-based learning (TBL) [1] to the task of assigning tags to postings in online chat conversations. We define a list of posting tags that have proven useful in chat-conversation analysis. We describe the templates used for posting act tagging in the context of template selection. We extend traditional approaches used in part-of-speech ta...
A semantic tagging tool for spoken dialogue corpus
In this paper, we report on our semantic tagging tool for spoken dialogue corpora. The tool can acquire analysis rules from a small-scale training corpus using Transformation-Based Learning (TBL). It can learn dialogue act tagging rules and semantic frame tagging rules. In an open test, precision is 72% for dialogue act tagging and 58% for semantic frame tagging.